
Overview

Dataset statistics

Number of variables 10
Number of observations 53940
Missing cells 0
Missing cells (%) 0.0%
Duplicate rows 143
Duplicate rows (%) 0.3%
Total size in memory 4.1 MiB
Average record size in memory 80.0 B
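The percentages and memory figures above can be cross-checked from the raw counts. The following is a minimal sketch of that arithmetic. It assumes (the report does not state this) that total memory is roughly the number of observations times the average record size, and that 1 MiB is 2**20 bytes.

```python
# Raw figures copied from the dataset statistics above.
n_variables = 10
n_observations = 53940
duplicate_rows = 143
avg_record_bytes = 80.0

# Total number of cells and the share of duplicate rows.
total_cells = n_variables * n_observations
duplicate_pct = 100 * duplicate_rows / n_observations

# Assumption: total memory ~= observations * average record size.
total_mib = n_observations * avg_record_bytes / 2**20

print(total_cells)              # 539400
print(f"{duplicate_pct:.1f}%")  # 0.3%
print(f"{total_mib:.1f} MiB")   # 4.1 MiB
```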

Variable types

Numeric 7
Categorical 3

Dataset

Description Statistical Exploration of Diamonds Features
URL
Copyright (c) Mr. Eslam Fouad 2023

Variable descriptions

Price price in US dollars ($326--$18,823)
Carat weight of the diamond (0.2--5.01)
Cut quality of the cut (Fair, Good, Very Good, Premium, Ideal)
Color diamond colour, from J (worst) to D (best)
Clarity a measurement of how clear the diamond is (I1 (worst), SI2, SI1, VS2, VS1, VVS2, VVS1, IF (best))
x length in mm (0--10.74)
y width in mm (0--58.9)
z depth in mm (0--31.8)
Depth total depth percentage = z / mean(x, y) = 2 * z / (x + y) (43--79)
Table width of top of diamond relative to widest point (43--95)
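The depth formula above can be sketched in a few lines. The function name `depth_percentage` is hypothetical, and the factor of 100 is an assumption: the formula as written yields a ratio, while the stated range (43--79) is a percentage. Rows whose dimensions are recorded as 0 are returned as None here rather than dividing by zero.

```python
from typing import Optional


def depth_percentage(x: float, y: float, z: float) -> Optional[float]:
    """Return 100 * 2z / (x + y), or None when x + y is zero."""
    if x + y == 0:
        return None  # dimensions recorded as 0 cannot yield a ratio
    return 100.0 * 2.0 * z / (x + y)


# First sample row of the report: x=3.95, y=3.98, z=2.43, depth=61.5.
computed = depth_percentage(3.95, 3.98, 2.43)
print(round(computed, 2))  # 61.29
```

The computed value is close to, but not identical with, the recorded depth of 61.5 for that row, which may reflect rounding in the recorded measurements.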

Alerts

Dataset has 143 (0.3%) duplicate rows Duplicates
carat is highly overall correlated with price and 3 other fields High correlation
price is highly overall correlated with carat and 3 other fields High correlation
x is highly overall correlated with carat and 3 other fields High correlation
y is highly overall correlated with carat and 3 other fields High correlation
z is highly overall correlated with carat and 3 other fields High correlation

Reproduction

Analysis started 2023-06-13 11:56:10.313663
Analysis finished 2023-06-13 11:56:24.827613
Duration 14.51 seconds
Software version pandas-profiling v3.6.6
Download configuration config.json

Variables

carat
Real number (ℝ)

Distinct 273
Distinct (%) 0.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 0.79793975
Minimum 0.2
Maximum 5.01
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 0.2
5-th percentile 0.3
Q1 0.4
median 0.7
Q3 1.04
95-th percentile 1.7
Maximum 5.01
Range 4.81
Interquartile range (IQR) 0.64

Descriptive statistics

Standard deviation 0.47401124
Coefficient of variation (CV) 0.59404391
Kurtosis 1.2566353
Mean 0.79793975
Median Absolute Deviation (MAD) 0.32
Skewness 1.1166459
Sum 43040.87
Variance 0.22468666
Monotonicity Not monotonic
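Several of the carat figures above are derived from others. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation, and variance from the reported minimum, maximum, quartiles, mean, and standard deviation. It uses only the numbers printed in the report, not the underlying data.

```python
# Figures copied from the carat summary above.
minimum, maximum = 0.2, 5.01
q1, q3 = 0.4, 1.04
mean = 0.79793975
std = 0.47401124

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance

print(round(value_range, 2))  # 4.81
print(round(iqr, 2))          # 0.64
print(round(cv, 4))           # 0.594
print(round(variance, 4))     # 0.2247
```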
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
0.3 2604 4.8%
0.31 2249 4.2%
1.01 2242 4.2%
0.7 1981 3.7%
0.32 1840 3.4%
1 1558 2.9%
0.9 1485 2.8%
0.41 1382 2.6%
0.4 1299 2.4%
0.71 1294 2.4%
Other values (263) 36006 66.8%
Value Count Frequency (%)
0.2 12 < 0.1%
0.21 9 < 0.1%
0.22 5 < 0.1%
0.23 293 0.5%
0.24 254 0.5%
0.25 212 0.4%
0.26 253 0.5%
0.27 233 0.4%
0.28 198 0.4%
0.29 130 0.2%
Value Count Frequency (%)
5.01 1 < 0.1%
4.5 1 < 0.1%
4.13 1 < 0.1%
4.01 2 < 0.1%
4 1 < 0.1%
3.67 1 < 0.1%
3.65 1 < 0.1%
3.51 1 < 0.1%
3.5 1 < 0.1%
3.4 1 < 0.1%
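The carat histogram above uses 50 fixed-size bins. A minimal sketch of that kind of binning follows, assuming equal-width bins that span the reported minimum and maximum (the exact bin edges used by the report are not shown). The helper name `bin_index` is hypothetical.

```python
def bin_index(value: float, lo: float, hi: float, bins: int = 50) -> int:
    """Map a value in [lo, hi] to a bin index in [0, bins - 1]."""
    if not lo <= value <= hi:
        raise ValueError("value outside histogram range")
    width = (hi - lo) / bins
    # The maximum sits on the right edge; fold it into the last bin.
    return min(int((value - lo) / width), bins - 1)


lo, hi = 0.2, 5.01  # carat minimum and maximum from the report
width = (hi - lo) / 50
print(round(width, 4))          # 0.0962
print(bin_index(0.2, lo, hi))   # 0
print(bin_index(1.0, lo, hi))   # 8
print(bin_index(5.01, lo, hi))  # 49
```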

cut
Categorical

Distinct 5
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 421.5 KiB
Ideal 21551
Premium 13791
Very Good 12082
Good 4906
Fair 1610

Length

Max length 9
Median length 7
Mean length 6.2865035
Min length 4

Characters and Unicode

Total characters 339094
Distinct characters 16
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row Ideal
2nd row Premium
3rd row Good
4th row Premium
5th row Good

Common Values

Value Count Frequency (%)
Ideal 21551 40.0%
Premium 13791 25.6%
Very Good 12082 22.4%
Good 4906 9.1%
Fair 1610 3.0%
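The frequency column above is each count divided by the total number of observations. A short sketch that recomputes it from the counts in this table:

```python
# Counts copied from the Common Values table for cut.
counts = {
    "Ideal": 21551,
    "Premium": 13791,
    "Very Good": 12082,
    "Good": 4906,
    "Fair": 1610,
}

total = sum(counts.values())
# Percentage of observations per category, rounded to one decimal place.
frequency_pct = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)  # 53940
print(frequency_pct)
# {'Ideal': 40.0, 'Premium': 25.6, 'Very Good': 22.4, 'Good': 9.1, 'Fair': 3.0}
```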

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
ideal 21551 32.6%
good 16988 25.7%
premium 13791 20.9%
very 12082 18.3%
fair 1610 2.4%
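The counts in this table differ from the category counts because they are per word: "good" appears in both "Good" and "Very Good". The sketch below reproduces the numbers under the assumption that labels are lowercased and split on whitespace. That tokenisation is inferred from the table, not taken from the report's documentation.

```python
from collections import Counter

# Category counts copied from the Common Values table for cut.
counts = {
    "Ideal": 21551,
    "Premium": 13791,
    "Very Good": 12082,
    "Good": 4906,
    "Fair": 1610,
}

# Assumed tokenisation: lowercase, split on whitespace.
word_counts = Counter()
for label, n in counts.items():
    for word in label.lower().split():
        word_counts[word] += n

total_words = sum(word_counts.values())
print(word_counts.most_common())
# [('ideal', 21551), ('good', 16988), ('premium', 13791), ('very', 12082), ('fair', 1610)]
print(round(100 * word_counts["good"] / total_words, 1))  # 25.7
```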

Most occurring characters

Value Count Frequency (%)
e 47424 14.0%
d 38539 11.4%
o 33976 10.0%
m 27582 8.1%
r 27483 8.1%
a 23161 6.8%
I 21551 6.4%
l 21551 6.4%
G 16988 5.0%
i 15401 4.5%
Other values (6) 65438 19.3%

Most occurring categories

Value Count Frequency (%)
Lowercase Letter 260990 77.0%
Uppercase Letter 66022 19.5%
Space Separator 12082 3.6%
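The category counts above can be reproduced with Python's standard `unicodedata` module, where `Ll`, `Lu`, and `Zs` are the Unicode general categories for lowercase letter, uppercase letter, and space separator. This is a sketch of the idea, weighting each character by its label's count. It is not necessarily how the report computes the figures.

```python
import unicodedata
from collections import Counter

# Category counts copied from the Common Values table for cut.
counts = {
    "Ideal": 21551,
    "Premium": 13791,
    "Very Good": 12082,
    "Good": 4906,
    "Fair": 1610,
}

# Tally the Unicode general category of every character,
# weighted by how often its label occurs.
category_counts = Counter()
for label, n in counts.items():
    for ch in label:
        category_counts[unicodedata.category(ch)] += n

print(category_counts["Ll"])          # 260990
print(category_counts["Lu"])          # 66022
print(category_counts["Zs"])          # 12082
print(sum(category_counts.values()))  # 339094
```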

Most frequent character per category

Lowercase Letter
Value Count Frequency (%)
e 47424 18.2%
d 38539 14.8%
o 33976 13.0%
m 27582 10.6%
r 27483 10.5%
a 23161 8.9%
l 21551 8.3%
i 15401 5.9%
u 13791 5.3%
y 12082 4.6%
Uppercase Letter
Value Count Frequency (%)
I 21551 32.6%
G 16988 25.7%
P 13791 20.9%
V 12082 18.3%
F 1610 2.4%
Space Separator
Value Count Frequency (%)
(space) 12082 100.0%

Most occurring scripts

Value Count Frequency (%)
Latin 327012 96.4%
Common 12082 3.6%

Most frequent character per script

Latin
Value Count Frequency (%)
e 47424 14.5%
d 38539 11.8%
o 33976 10.4%
m 27582 8.4%
r 27483 8.4%
a 23161 7.1%
I 21551 6.6%
l 21551 6.6%
G 16988 5.2%
i 15401 4.7%
Other values (5) 53356 16.3%
Common
Value Count Frequency (%)
(space) 12082 100.0%

Most occurring blocks

Value Count Frequency (%)
ASCII 339094 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
e 47424 14.0%
d 38539 11.4%
o 33976 10.0%
m 27582 8.1%
r 27483 8.1%
a 23161 6.8%
I 21551 6.4%
l 21551 6.4%
G 16988 5.0%
i 15401 4.5%
Other values (6) 65438 19.3%

color
Categorical

Distinct 7
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 421.5 KiB
G 11292
E 9797
F 9542
H 8304
D 6775
Other values (2) 8230

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 53940
Distinct characters 7
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row E
2nd row E
3rd row E
4th row I
5th row J

Common Values

Value Count Frequency (%)
G 11292 20.9%
E 9797 18.2%
F 9542 17.7%
H 8304 15.4%
D 6775 12.6%
I 5422 10.1%
J 2808 5.2%

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
g 11292 20.9%
e 9797 18.2%
f 9542 17.7%
h 8304 15.4%
d 6775 12.6%
i 5422 10.1%
j 2808 5.2%

Most occurring characters

Value Count Frequency (%)
G 11292 20.9%
E 9797 18.2%
F 9542 17.7%
H 8304 15.4%
D 6775 12.6%
I 5422 10.1%
J 2808 5.2%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 53940 100.0%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
G 11292 20.9%
E 9797 18.2%
F 9542 17.7%
H 8304 15.4%
D 6775 12.6%
I 5422 10.1%
J 2808 5.2%

Most occurring scripts

Value Count Frequency (%)
Latin 53940 100.0%

Most frequent character per script

Latin
Value Count Frequency (%)
G 11292 20.9%
E 9797 18.2%
F 9542 17.7%
H 8304 15.4%
D 6775 12.6%
I 5422 10.1%
J 2808 5.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 53940 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
G 11292 20.9%
E 9797 18.2%
F 9542 17.7%
H 8304 15.4%
D 6775 12.6%
I 5422 10.1%
J 2808 5.2%

clarity
Categorical

Distinct 8
Distinct (%) < 0.1%
Missing 0
Missing (%) 0.0%
Memory size 421.5 KiB
SI1 13065
VS2 12258
SI2 9194
VS1 8171
VVS2 5066
Other values (3) 6186

Length

Max length 4
Median length 3
Mean length 3.1147571
Min length 2

Characters and Unicode

Total characters 168010
Distinct characters 6
Distinct categories 2
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 0
Unique (%) 0.0%

Sample

1st row SI2
2nd row SI1
3rd row VS1
4th row VS2
5th row SI2

Common Values

Value Count Frequency (%)
SI1 13065 24.2%
VS2 12258 22.7%
SI2 9194 17.0%
VS1 8171 15.1%
VVS2 5066 9.4%
VVS1 3655 6.8%
IF 1790 3.3%
I1 741 1.4%
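The total character count (168010) and mean length (3.1147571) reported earlier for clarity follow directly from this table. A small sketch of the weighted sum, using only the counts shown here:

```python
# Counts copied from the Common Values table for clarity.
counts = {
    "SI1": 13065, "VS2": 12258, "SI2": 9194, "VS1": 8171,
    "VVS2": 5066, "VVS1": 3655, "IF": 1790, "I1": 741,
}

n_rows = sum(counts.values())
# Each label contributes len(label) characters per occurrence.
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / n_rows

print(n_rows)                 # 53940
print(total_characters)       # 168010
print(round(mean_length, 4))  # 3.1148
```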

Length

Histogram of lengths of the category

Common Values (Plot)

Value Count Frequency (%)
si1 13065 24.2%
vs2 12258 22.7%
si2 9194 17.0%
vs1 8171 15.1%
vvs2 5066 9.4%
vvs1 3655 6.8%
if 1790 3.3%
i1 741 1.4%

Most occurring characters

Value Count Frequency (%)
S 51409 30.6%
V 37871 22.5%
2 26518 15.8%
1 25632 15.3%
I 24790 14.8%
F 1790 1.1%

Most occurring categories

Value Count Frequency (%)
Uppercase Letter 115860 69.0%
Decimal Number 52150 31.0%

Most frequent character per category

Uppercase Letter
Value Count Frequency (%)
S 51409 44.4%
V 37871 32.7%
I 24790 21.4%
F 1790 1.5%
Decimal Number
Value Count Frequency (%)
2 26518 50.8%
1 25632 49.2%

Most occurring scripts

Value Count Frequency (%)
Latin 115860 69.0%
Common 52150 31.0%

Most frequent character per script

Latin
Value Count Frequency (%)
S 51409 44.4%
V 37871 32.7%
I 24790 21.4%
F 1790 1.5%
Common
Value Count Frequency (%)
2 26518 50.8%
1 25632 49.2%

Most occurring blocks

Value Count Frequency (%)
ASCII 168010 100.0%

Most frequent character per block

ASCII
Value Count Frequency (%)
S 51409 30.6%
V 37871 22.5%
2 26518 15.8%
1 25632 15.3%
I 24790 14.8%
F 1790 1.1%

depth
Real number (ℝ)

Distinct 184
Distinct (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 61.749405
Minimum 43
Maximum 79
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 43
5-th percentile 59.3
Q1 61
median 61.8
Q3 62.5
95-th percentile 63.8
Maximum 79
Range 36
Interquartile range (IQR) 1.5

Descriptive statistics

Standard deviation 1.4326213
Coefficient of variation (CV) 0.023200569
Kurtosis 5.7394146
Mean 61.749405
Median Absolute Deviation (MAD) 0.7
Skewness -0.082294026
Sum 3330762.9
Variance 2.0524038
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
62 2239 4.2%
61.9 2163 4.0%
61.8 2077 3.9%
62.2 2039 3.8%
62.1 2020 3.7%
61.6 1956 3.6%
62.3 1940 3.6%
61.7 1904 3.5%
62.4 1792 3.3%
61.5 1719 3.2%
Other values (174) 34091 63.2%
Value Count Frequency (%)
43 2 < 0.1%
44 1 < 0.1%
50.8 1 < 0.1%
51 1 < 0.1%
52.2 1 < 0.1%
52.3 1 < 0.1%
52.7 1 < 0.1%
53 1 < 0.1%
53.1 1 < 0.1%
53.2 2 < 0.1%
Value Count Frequency (%)
79 2 < 0.1%
78.2 1 < 0.1%
73.6 1 < 0.1%
72.9 1 < 0.1%
72.2 1 < 0.1%
71.8 1 < 0.1%
71.6 2 < 0.1%
71.3 1 < 0.1%
71.2 1 < 0.1%
71 1 < 0.1%

table
Real number (ℝ)

Distinct 127
Distinct (%) 0.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 57.457184
Minimum 43
Maximum 95
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 43
5-th percentile 54
Q1 56
median 57
Q3 59
95-th percentile 61
Maximum 95
Range 52
Interquartile range (IQR) 3

Descriptive statistics

Standard deviation 2.2344906
Coefficient of variation (CV) 0.038889664
Kurtosis 2.8018569
Mean 57.457184
Median Absolute Deviation (MAD) 1
Skewness 0.79689585
Sum 3099240.5
Variance 4.9929481
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
56 9881 18.3%
57 9724 18.0%
58 8369 15.5%
59 6572 12.2%
55 6268 11.6%
60 4241 7.9%
54 2594 4.8%
61 2282 4.2%
62 1273 2.4%
63 588 1.1%
Other values (117) 2148 4.0%
Value Count Frequency (%)
43 1 < 0.1%
44 1 < 0.1%
49 2 < 0.1%
50 2 < 0.1%
50.1 1 < 0.1%
51 9 < 0.1%
51.6 1 < 0.1%
52 56 0.1%
52.4 1 < 0.1%
52.8 2 < 0.1%
Value Count Frequency (%)
95 1 < 0.1%
79 1 < 0.1%
76 1 < 0.1%
73 4 < 0.1%
71 1 < 0.1%
70 9 < 0.1%
69 9 < 0.1%
68 21 < 0.1%
67 42 0.1%
66 91 0.2%

price
Real number (ℝ)

Distinct 11602
Distinct (%) 21.5%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 3932.7997
Minimum 326
Maximum 18823
Zeros 0
Zeros (%) 0.0%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 326
5-th percentile 544
Q1 950
median 2401
Q3 5324.25
95-th percentile 13107.1
Maximum 18823
Range 18497
Interquartile range (IQR) 4374.25

Descriptive statistics

Standard deviation 3989.4397
Coefficient of variation (CV) 1.014402
Kurtosis 2.1776958
Mean 3932.7997
Median Absolute Deviation (MAD) 1670
Skewness 1.6183953
Sum 2.1213522 × 10^8
Variance 15915629
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
605 132 0.2%
802 127 0.2%
625 126 0.2%
828 125 0.2%
776 124 0.2%
698 121 0.2%
789 121 0.2%
544 120 0.2%
666 114 0.2%
552 113 0.2%
Other values (11592) 52717 97.7%
Value Count Frequency (%)
326 2 < 0.1%
327 1 < 0.1%
334 1 < 0.1%
335 1 < 0.1%
336 2 < 0.1%
337 2 < 0.1%
338 1 < 0.1%
339 1 < 0.1%
340 1 < 0.1%
342 1 < 0.1%
Value Count Frequency (%)
18823 1 < 0.1%
18818 1 < 0.1%
18806 1 < 0.1%
18804 1 < 0.1%
18803 1 < 0.1%
18797 1 < 0.1%
18795 2 < 0.1%
18791 2 < 0.1%
18788 1 < 0.1%
18787 1 < 0.1%

x
Real number (ℝ)

length in mm (0--10.74)

Distinct 554
Distinct (%) 1.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5.7311572
Minimum 0
Maximum 10.74
Zeros 8
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 0
5-th percentile 4.29
Q1 4.71
median 5.7
Q3 6.54
95-th percentile 7.66
Maximum 10.74
Range 10.74
Interquartile range (IQR) 1.83

Descriptive statistics

Standard deviation 1.1217607
Coefficient of variation (CV) 0.19573023
Kurtosis -0.61816067
Mean 5.7311572
Median Absolute Deviation (MAD) 0.93
Skewness 0.37867634
Sum 309138.62
Variance 1.2583472
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
4.37 448 0.8%
4.34 437 0.8%
4.33 429 0.8%
4.38 428 0.8%
4.32 425 0.8%
4.35 407 0.8%
4.39 388 0.7%
4.31 387 0.7%
4.36 386 0.7%
4.4 373 0.7%
Other values (544) 49832 92.4%
Value Count Frequency (%)
0 8 < 0.1%
3.73 2 < 0.1%
3.74 1 < 0.1%
3.76 1 < 0.1%
3.77 1 < 0.1%
3.79 2 < 0.1%
3.81 3 < 0.1%
3.82 2 < 0.1%
3.83 3 < 0.1%
3.84 4 < 0.1%
Value Count Frequency (%)
10.74 1 < 0.1%
10.23 1 < 0.1%
10.14 1 < 0.1%
10.02 1 < 0.1%
10.01 1 < 0.1%
10 1 < 0.1%
9.86 1 < 0.1%
9.66 1 < 0.1%
9.65 1 < 0.1%
9.54 1 < 0.1%

y
Real number (ℝ)

width in mm (0--58.9)

Distinct 552
Distinct (%) 1.0%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 5.734526
Minimum 0
Maximum 58.9
Zeros 7
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 0
5-th percentile 4.3
Q1 4.72
median 5.71
Q3 6.54
95-th percentile 7.65
Maximum 58.9
Range 58.9
Interquartile range (IQR) 1.82

Descriptive statistics

Standard deviation 1.1421347
Coefficient of variation (CV) 0.19916811
Kurtosis 91.214557
Mean 5.734526
Median Absolute Deviation (MAD) 0.92
Skewness 2.4341667
Sum 309320.33
Variance 1.3044716
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
4.34 437 0.8%
4.37 435 0.8%
4.35 425 0.8%
4.33 421 0.8%
4.32 414 0.8%
4.39 407 0.8%
4.38 406 0.8%
4.4 387 0.7%
4.31 386 0.7%
4.41 384 0.7%
Other values (542) 49838 92.4%
Value Count Frequency (%)
0 7 < 0.1%
3.68 1 < 0.1%
3.71 2 < 0.1%
3.72 1 < 0.1%
3.73 1 < 0.1%
3.75 1 < 0.1%
3.77 2 < 0.1%
3.78 5 < 0.1%
3.8 1 < 0.1%
3.81 1 < 0.1%
Value Count Frequency (%)
58.9 1 < 0.1%
31.8 1 < 0.1%
10.54 1 < 0.1%
10.16 1 < 0.1%
10.1 1 < 0.1%
9.94 2 < 0.1%
9.85 1 < 0.1%
9.81 1 < 0.1%
9.63 1 < 0.1%
9.59 1 < 0.1%

z
Real number (ℝ)

depth in mm (0--31.8)

Distinct 375
Distinct (%) 0.7%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Mean 3.5387338
Minimum 0
Maximum 31.8
Zeros 20
Zeros (%) < 0.1%
Negative 0
Negative (%) 0.0%
Memory size 421.5 KiB

Quantile statistics

Minimum 0
5-th percentile 2.65
Q1 2.91
median 3.53
Q3 4.04
95-th percentile 4.73
Maximum 31.8
Range 31.8
Interquartile range (IQR) 1.13

Descriptive statistics

Standard deviation 0.70569885
Coefficient of variation (CV) 0.19942129
Kurtosis 47.086619
Mean 3.5387338
Median Absolute Deviation (MAD) 0.57
Skewness 1.5224226
Sum 190879.3
Variance 0.49801086
Monotonicity Not monotonic
Histogram with fixed size bins (bins=50)
Value Count Frequency (%)
2.7 767 1.4%
2.69 748 1.4%
2.71 738 1.4%
2.68 730 1.4%
2.72 697 1.3%
2.67 649 1.2%
2.73 612 1.1%
2.66 555 1.0%
2.74 548 1.0%
4.02 538 1.0%
Other values (365) 47358 87.8%
Value Count Frequency (%)
0 20 < 0.1%
1.07 1 < 0.1%
1.41 1 < 0.1%
1.53 1 < 0.1%
2.06 1 < 0.1%
2.24 1 < 0.1%
2.25 1 < 0.1%
2.26 1 < 0.1%
2.27 1 < 0.1%
2.28 1 < 0.1%
Value Count Frequency (%)
31.8 1 < 0.1%
8.06 1 < 0.1%
6.98 1 < 0.1%
6.72 1 < 0.1%
6.43 1 < 0.1%
6.38 1 < 0.1%
6.31 1 < 0.1%
6.27 1 < 0.1%
6.24 1 < 0.1%
6.17 1 < 0.1%

Interactions

[Figures: 49 interaction plots (images not preserved)]

Correlations

carat depth table price x y z cut color clarity
carat 1.000 0.030 0.195 0.963 0.996 0.996 0.993 0.116 0.135 0.159
depth 0.030 1.000 -0.245 0.010 -0.023 -0.025 0.103 0.406 0.021 0.077
table 0.195 -0.245 1.000 0.172 0.202 0.196 0.160 0.290 0.021 0.050
price 0.963 0.010 0.172 1.000 0.963 0.963 0.957 0.093 0.093 0.145
x 0.996 -0.023 0.202 0.963 1.000 0.998 0.987 0.148 0.130 0.158
y 0.996 -0.025 0.196 0.963 0.998 1.000 0.987 0.108 0.133 0.198
z 0.993 0.103 0.160 0.957 0.987 0.987 1.000 0.096 0.100 0.201
cut 0.116 0.406 0.290 0.093 0.148 0.108 0.096 1.000 0.036 0.142
color 0.135 0.021 0.021 0.093 0.130 0.133 0.100 0.036 1.000 0.079
clarity 0.159 0.077 0.050 0.145 0.158 0.198 0.201 0.142 0.079 1.000
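The "High correlation" alerts near the top of the report can be related to this matrix by scanning each row for large off-diagonal entries. The sketch below does that with a cutoff of 0.5. That cutoff is an assumption chosen for illustration (the report does not state the threshold it used), and `highly_correlated` is a hypothetical helper name.

```python
fields = ["carat", "depth", "table", "price", "x", "y", "z",
          "cut", "color", "clarity"]

# Rows copied from the correlation matrix above.
matrix = {
    "carat":   [1.000, 0.030, 0.195, 0.963, 0.996, 0.996, 0.993, 0.116, 0.135, 0.159],
    "depth":   [0.030, 1.000, -0.245, 0.010, -0.023, -0.025, 0.103, 0.406, 0.021, 0.077],
    "table":   [0.195, -0.245, 1.000, 0.172, 0.202, 0.196, 0.160, 0.290, 0.021, 0.050],
    "price":   [0.963, 0.010, 0.172, 1.000, 0.963, 0.963, 0.957, 0.093, 0.093, 0.145],
    "x":       [0.996, -0.023, 0.202, 0.963, 1.000, 0.998, 0.987, 0.148, 0.130, 0.158],
    "y":       [0.996, -0.025, 0.196, 0.963, 0.998, 1.000, 0.987, 0.108, 0.133, 0.198],
    "z":       [0.993, 0.103, 0.160, 0.957, 0.987, 0.987, 1.000, 0.096, 0.100, 0.201],
    "cut":     [0.116, 0.406, 0.290, 0.093, 0.148, 0.108, 0.096, 1.000, 0.036, 0.142],
    "color":   [0.135, 0.021, 0.021, 0.093, 0.130, 0.133, 0.100, 0.036, 1.000, 0.079],
    "clarity": [0.159, 0.077, 0.050, 0.145, 0.158, 0.198, 0.201, 0.142, 0.079, 1.000],
}

THRESHOLD = 0.5  # hypothetical cutoff, not stated in the report


def highly_correlated(field: str) -> list:
    """List the other fields whose |correlation| with `field` meets the cutoff."""
    return [other for other, r in zip(fields, matrix[field])
            if other != field and abs(r) >= THRESHOLD]


flagged = [f for f in fields if highly_correlated(f)]
print(flagged)                     # ['carat', 'price', 'x', 'y', 'z']
print(highly_correlated("carat"))  # ['price', 'x', 'y', 'z']
```

With this cutoff the flagged fields are the same five variables named in the alerts, each with four partners, while depth, table, cut, color, and clarity stay below it.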

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

carat cut color clarity depth table price x y z
0 0.23 Ideal E SI2 61.5 55.0 326 3.95 3.98 2.43
1 0.21 Premium E SI1 59.8 61.0 326 3.89 3.84 2.31
2 0.23 Good E VS1 56.9 65.0 327 4.05 4.07 2.31
3 0.29 Premium I VS2 62.4 58.0 334 4.20 4.23 2.63
4 0.31 Good J SI2 63.3 58.0 335 4.34 4.35 2.75
5 0.24 Very Good J VVS2 62.8 57.0 336 3.94 3.96 2.48
6 0.24 Very Good I VVS1 62.3 57.0 336 3.95 3.98 2.47
7 0.26 Very Good H SI1 61.9 55.0 337 4.07 4.11 2.53
8 0.22 Fair E VS2 65.1 61.0 337 3.87 3.78 2.49
9 0.23 Very Good H VS1 59.4 61.0 338 4.00 4.05 2.39
carat cut color clarity depth table price x y z
53930 0.71 Premium E SI1 60.5 55.0 2756 5.79 5.74 3.49
53931 0.71 Premium F SI1 59.8 62.0 2756 5.74 5.73 3.43
53932 0.70 Very Good E VS2 60.5 59.0 2757 5.71 5.76 3.47
53933 0.70 Very Good E VS2 61.2 59.0 2757 5.69 5.72 3.49
53934 0.72 Premium D SI1 62.7 59.0 2757 5.69 5.73 3.58
53935 0.72 Ideal D SI1 60.8 57.0 2757 5.75 5.76 3.50
53936 0.72 Good D SI1 63.1 55.0 2757 5.69 5.75 3.61
53937 0.70 Very Good D SI1 62.8 60.0 2757 5.66 5.68 3.56
53938 0.86 Premium H SI2 61.0 58.0 2757 6.15 6.12 3.74
53939 0.75 Ideal D SI2 62.2 55.0 2757 5.83 5.87 3.64

Duplicate rows

Most frequently occurring

carat cut color clarity depth table price x y z # duplicates
84 0.79 Ideal G SI1 62.3 57.0 2898 5.90 5.85 3.66 5
0 0.30 Good J VS1 63.4 57.0 394 4.23 4.26 2.69 2
1 0.30 Ideal G IF 62.1 55.0 863 4.32 4.35 2.69 2
2 0.30 Ideal G VS2 63.0 55.0 675 4.31 4.29 2.71 2
3 0.30 Ideal H SI1 62.2 57.0 450 4.26 4.29 2.66 2
4 0.30 Ideal H SI1 62.2 57.0 450 4.27 4.28 2.66 2
5 0.30 Premium D SI1 62.2 58.0 709 4.31 4.28 2.67 2
6 0.30 Very Good G VS2 63.0 55.0 526 4.29 4.31 2.71 2
7 0.30 Very Good J VS1 63.4 57.0 506 4.26 4.23 2.69 2
8 0.31 Good D SI1 63.5 56.0 571 4.29 4.31 2.73 2
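Duplicate rows like those above can be found by grouping identical records and counting each group. The sketch below uses a handful of records shaped like the rows in this table. The repetition in the toy list is illustrative and does not reproduce the real counts. It reports two quantities, because the report does not say here whether its duplicate figure counts distinct duplicated records or the surplus copies.

```python
from collections import Counter

# Records as (carat, cut, color, clarity, depth, table, price, x, y, z).
# Values follow rows shown above; the repetition is illustrative only.
rows = [
    (0.79, "Ideal", "G", "SI1", 62.3, 57.0, 2898, 5.90, 5.85, 3.66),
    (0.30, "Ideal", "H", "SI1", 62.2, 57.0, 450, 4.26, 4.29, 2.66),
    (0.79, "Ideal", "G", "SI1", 62.3, 57.0, 2898, 5.90, 5.85, 3.66),
    (0.30, "Ideal", "H", "SI1", 62.2, 57.0, 450, 4.27, 4.28, 2.66),
    (0.30, "Ideal", "H", "SI1", 62.2, 57.0, 450, 4.26, 4.29, 2.66),
    (0.79, "Ideal", "G", "SI1", 62.3, 57.0, 2898, 5.90, 5.85, 3.66),
]

occurrences = Counter(rows)
# Keep only records that appear more than once.
duplicated = {row: n for row, n in occurrences.items() if n > 1}

distinct_duplicated = len(duplicated)                    # distinct repeated records
surplus_copies = sum(n - 1 for n in duplicated.values()) # copies beyond the first

print(distinct_duplicated)  # 2
print(surplus_copies)       # 3
```

Note that the second and fourth records differ only in x and y, so they are not duplicates of each other, just as rows 3 and 4 of the table above are listed separately.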